XL-WSD: An Extra-Large and Cross-Lingual Evaluation Framework for Word Sense Disambiguation

Abstract

Transformer-based architectures brought a breeze of change to Word Sense Disambiguation (WSD), improving models' performances by a large margin. The fast development of new approaches has been further encouraged by a well-framed evaluation suite for English, which has allowed their performances to be kept track of and compared fairly. However, other languages have remained largely unexplored, as testing data are available for a few languages only and the evaluation setting is rather matted. In this paper, we untangle this situation by proposing XL-WSD, a cross-lingual evaluation benchmark for the WSD task featuring sense-annotated test sets in 18 languages from six different linguistic families, together with language-specific silver training data. We leverage the XL-WSD datasets to conduct an extensive evaluation of neural and knowledge-based approaches, including the most recent multilingual language models. Results show that zero-shot knowledge transfer across languages is a promising research direction within the WSD field, especially when considering low-resourced languages, where pre-trained models still perform poorly. We make the code for performing the experiments available at https://sapienzanlp.github.io/xl-wsd/.

Similar resources

Cross-Lingual Word Sense Disambiguation

Word Sense Disambiguation using a cross-lingual approach has been applied successfully to languages like Farsi and Hindi. However, a comparable corpus in the form of Wikipedia articles available in English and Hindi has been used for such a task. This motivated us to extend the approach and test the results when a parallel corpus is used. In this project, we specifically wanted to observe if the a...

Cross-lingual Word Sense Disambiguation: Documentation on Data and Evaluation

We propose a multilingual unsupervised Word Sense Disambiguation (WSD) task for a sample of English nouns. Instead of providing manually sense-tagged examples for each sense of a polysemous noun, our sense inventory is built up on the basis of the Europarl parallel corpus. The multilingual setup involves the translations of a given English polysemous noun in five supported languages, viz. Dutch...

IXA at CLEF 2008 Robust-WSD Task: using Word Sense Disambiguation for (Cross Lingual) Information Retrieval

This paper describes the participation of the IXA NLP group in the CLEF 2008 Robust-WSD Task. This is our first time at CLEF, and we participated in both the monolingual (English) and the bilingual (Spanish to English) subtasks. We tried several query and document expansion and translation strategies, with and without the use of the word sense disambiguation results provided by the organizers. ...

LIMSI : Cross-lingual Word Sense Disambiguation using Translation Sense Clustering

We describe the LIMSI system for the SemEval-2013 Cross-lingual Word Sense Disambiguation (CLWSD) task. Word senses are represented by means of translation clusters in different languages built by a cross-lingual Word Sense Induction (WSI) method. Our CLWSD classifier exploits the WSI output for selecting appropriate translations for target words in context. We present the design of the system ...

Cross-lingual Word Sense Disambiguation for Predicate Labelling of French

We address the problem of transferring semantic annotations, more specifically predicate labellings, from one language to another using parallel corpora. Previous work has transferred these annotations directly at the token level, leading to low recall. We present a global approach to annotation transfer that aggregates information across the whole parallel corpus. We show that this global meth...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2021

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v35i15.17609